1. Patch SG0002837 Release Note
This release note describes patch SG0002837 to IRIX 6.4.
1.1 Supported Hardware Platforms
This patch contains bug fixes for O200, O2000 and OCTANE.
The software cannot be installed on other configurations.
1.2 Supported Software Platforms
This patch contains bug fixes for the SCSI driver on a
system running IRIX 6.4. The software cannot be installed
on other configurations.
1.3 Patch Relationships
This patch replaces patches SG0001924, SG0002007, SG0001817,
SG0002118, SG0002305, SG0002354, SG0002592, and SG0002797.
The following patches (or their successors) must also be
installed:
o SG0002061 - PCI rollup
o SG0002211 - kernel rollup
o SG0002073 - tape rollup
o SG0002173 - hinv rollup (needed by tape rollup)
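
The prerequisite list above can be turned into a simple checklist. The following is a minimal sketch only: the function name is hypothetical, the installed patch names are passed in by hand as arguments rather than queried from the system, and it cannot recognize a "successor" patch, which the note allows as a substitute.

```shell
#!/bin/sh
# Sketch: report which prerequisite patches from Section 1.3 are absent
# from a list of installed patch names given as arguments.
# Assumption: the caller supplies the installed names; nothing is queried.

missing_prereqs() {
    required="SG0002061 SG0002211 SG0002073 SG0002173"
    for want in $required; do
        found=no
        for have in "$@"; do
            if [ "$have" = "$want" ]; then
                found=yes
            fi
        done
        # Print each required patch that was not in the argument list.
        if [ "$found" = no ]; then
            echo "$want"
        fi
    done
}

# Example: only the PCI and hinv rollups are present.
missing_prereqs SG0002061 SG0002173
```

A patch that is reported as missing may still be satisfied by a later rollup, so the output is a prompt to check, not a verdict.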
1.4 Bugs Fixed by Patch SG0002837
This patch contains fixes for the following bugs in IRIX
6.4. Bug numbers from the Silicon Graphics bug tracking
system are included for reference.
o 576024 - Under high load, QL SCSI with a rev. B bridge
chip would corrupt data written to disk on cache line
aligned transfers. The correction is to disable prefetch
when performing disk writes using this configuration.
o 570897 - A partial fix for this bug has been integrated
into the SCSI disk driver. "dksc" now checks the
block size on open for removable media devices. CDROM
filesystems are created with 512 or 2048 byte blocks.
This change permits the CDROM block size to be changed
by "scsicontrol -b" and have that change recognized by
dksc when the device is next opened. There are other
issues associated with this bug that are not addressed
with this patch.
o 555398 - QL driver incorrectly handling SRF_NEG_{A}SYNC
bits. SRF_NEG_SYNC and SRF_NEG_ASYNC flags were being
honored for the command with which they were
associated, but following the next inquiry command
the state would revert. Added code to make the state
persistent.
o 557814 - The disk driver sense callback routine could
be called before the disk block size has been read. The
fix is to not do the callback if the disk block size
has not been determined.
o 558173 - QL: Spurious mailbox timeout errors possible.
Mailbox and command timeouts need to be at least as
large as the watchdog timer granularity. Otherwise
spurious timeouts may result.
1.4.1 Bugs Fixed by Patch SG0002797
o 555155 - RAIDs with single SP appear as "same" device
to failover code.
o 555173 - The first open of a spun-down disk drive would
panic the machine.
1.4.2 Bugs Fixed by Patch SG0002592
o 491561 - Following a SCSI bus reset, the current CDROM
block size reverts to default. This could cause data
corruption and/or overrun conditions if the expected
block size is different from the default. A sense
callback mechanism was implemented to reset the
block size back to the expected value prior to issuing
any commands to mounted CDROMs.
o 541385 - Following receipt of a BUSY status on a
command for a device which had multiple commands
queued, the ISP firmware would double decrement the
device queue counter causing commands to hang within
the ISP.
o 542213 - Under rare conditions resulting from hardware
failures when multiple commands were queued to a
device, a system panic might result due to a
simplifying assumption.
o 543096 - During device open, if a device returned BUSY
status to an inquiry command the open would fail. The
fix is to retry the inquiry command in qlinfo for up to
4 seconds.
o 543893 - Some SCSI-1 devices don't set the sync bit in
their inquiry string even though they are capable of
supporting sync speeds. Added a
ql_force_sync_negotiation DEVICE_ADMIN variable to
force sync negotiation in such cases.
o 544691 - Following a SCSI bus reprobe which resulted in
deletion of device nodes which were members of failover
groups, an auto-failover operation was initiated. This
behaviour is now configurable, via the variable
fo_auto_failover_on_dev_deletion, and is disabled by
default.
o 544739 - Following RAID failover, RAID LUN inventory
entries were not being removed when LUNs migrated from
one path to another.
o 545726 - Mounting of and extraction of data from
iso9660 mounted CDROM devices (Toshiba 12x CDROMs in
particular) was problematic due to a firmware problem
tickled by too many sync negotiations. The workaround is
to inhibit sync negotiation of any kind on CDROM
devices.
o 547526 - The usage of "primary" and "secondary" paths
following a manual failover switch operation (scsifo
-s) was ambiguous.
o 550961 - Floppy driver was creating char and block
device special files with wrong permissions. Only char
device special files should be created and should have
world readable/writable permissions.
1.4.3 Bugs Fixed by Patch SG0002354
o 515632 - A couple of race conditions in dsreq.c and
dsglue.c could result in processes hanging due to SCSI
commands appearing to not complete, or panics due to
deallocation of structures by one thread out from under
another.
o 507598 - For QIC tape devices, the default tape device
and the NR (no-rewind) device were linked to the
no-swap devices. They should be linked to the swap
devices.
o 509354, 515318, 515319 - Fixes for disk performance
bugs found while tuning a large I/O benchmarking system.
o 518746, 526179, 527045 - Multiple LUN devices created
for same physical floppy device. Problem with TEAC
devices.
o 521257 - QL driver needs to enforce 32-bit buffer
alignment requirement. An override mechanism is provided
for those applications requiring arbitrary alignment.
o 526911 - During boot, occasional SCSI command timeouts
would be observed. This was due to a race condition
where command completion interrupts were not being
honored.
o 531409 - QL driver was not checking whether an I/O
would span more than the maximum number of allowed
scatter-gather continuation entries, 254. Given 16kB
pages and 5 scatter-gather entries per continuation
element, the largest I/O size is (254 * 5 + 2) * 16kB =
20.8 MB.
o 533856 - QL driver was artificially truncating sense
data returned by device to 18 bytes, even though a
maximum of 32 was possible.
o 533732 - The failover infrastructure shouldn't panic
when the maximum number of alternate paths is exceeded.
It's typically a symptom of a SCSI device responding to
all IDs as a result of a SCSI ID conflict.
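
The size limit quoted for bug 531409 can be reproduced with a few lines of shell arithmetic. This is only a sketch: the function name and argument order are illustrative, and the note does not say what the extra 2 entries in the formula correspond to, so they are simply passed through.

```shell
#!/bin/sh
# Worked arithmetic for the largest QL I/O size quoted for bug 531409.
# All figures come from the release note; the function is illustrative.

max_io_bytes() {
    continuations=$1   # maximum continuation entries (254 in the note)
    per_entry=$2       # scatter-gather entries per continuation (5)
    extra=$3           # the additional entries in the formula (2)
    page_kb=$4         # page size in kB (16)
    echo $(( (continuations * per_entry + extra) * page_kb * 1024 ))
}

# prints 20840448, i.e. about 20.8 MB in decimal megabytes
max_io_bytes 254 5 2 16
```

The result matches the 20.8 MB figure only when a megabyte is taken as 1,000,000 bytes; in units of 1,048,576 bytes the same limit is a little under 19.9.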
1.4.4 Bugs Fixed by Patch SG0002305
o 508121 - hinv entries were not being created for some
removable media devices.
o 513679 - Added support for MSCSI/B+Octane, a board
functionally equivalent to an MSCSI/B but with a fix
for the XBow clock problem.
o (unreported problem) - The sense of
fo_auto_failover_on_start flipped to 0, so auto
failover does not occur on bootup.
1.4.5 Bugs Fixed by Patch SG0002118
o Unreported bug that caused all commands issued through
the ds (devscsi) driver to be preceded by an inquiry
command.
o 428109 - SCSI infrastructure didn't allow for general
addition of new tape devices, third party tpsc clones
or other third party device drivers. Mechanism is
documented in master.d/scsi.
o 466808 - man page for scsifo was missing.
o 487029 - Wide negotiation by QL driver broke Fujitsu
Diana-1 support. The algorithm for determining when
wide/sync negotiation is done was modified to only do
negotiation if the device supports it, i.e. as reported
by inquiry. Negotiation is still done on a request
sense.
o 499004 - SN architecture makes no guarantee about the
relative completion order of PIOs and DMA writes. A PIO
read of the response head register can therefore pass
the DMA response write, potentially resulting in "stale"
responses and panics.
o 502456 - Race conditions in bus probing code could
cause system panics if multiple simultaneous bus probe
operations were initiated on the same bus.
o 502809 - This patch is needed to support the
CA-UniCenter software.
o 503802 - Following removal of a LUN, failover was
attempted if the LUN was part of a failover group. A
check needs to be made to ensure that a secondary path
exists before attempting failover.
1.4.6 Bugs Fixed in Predecessors to Patch SG0002118
o 446799 - "Medium not present" message when changing CDs
on multi-CD inst.
o 448232 - Misleading "ISP" error message.
o 450305, 460562 - QL driver panics during high level of
timeout/reset activity. A side effect of this was that
one would see occasional timeouts on other controllers.
o 457559 - QL driver does not support case of QERR = 1.
o 456909 - ql does not support abort message
o 460270 - ql driver artificially throttles device queue
to a maximum of 32 outstanding commands.
o 461745 - sr_buffer and sr_flags being zeroed by QL
driver.
o 462475 - ql messes with the config registers.
o 463095 - DEVICE_ADMIN for ql_sync_period is broken.
o 466300 - Added support for MSCSI/B.
o 467576 - QL driver retries selection timeout 3 times -
shouldn't. Leads to excessive bus probe times.
o 467674 - QL driver needs to print "short" device names
if possible.
o 467675 - DKSC driver needs to print "short" device
names if possible.
o 468913 - bogus parity errors reported
o 469303 - Lack of locking could cause failed mailbox
commands
o 471565 - Recursion on QL ISR could cause I/O hang
because interrupt thread would deadlock while
attempting to grab a mutex it already held. Also could
cause mailbox command failure.
o 471991 - failover infrastructure could cause panic if
request for failover was initiated on an invalid
device.
o 472863 - ql needs to support SRF_AEN_ACK
o 472866 - MSCSI for OCTANE needs to pcio_priority_set
o 473734 - SCSI driver on Origin 2000 cannot talk to
Pioneer 1004X CD-jukebox over SCSI
o 475117 - In some cases, tape compression devices were
not being created. This fix is in conjunction with
patch 1903 or its successor.
o 477000 - QL driver cmd timeout strategy broken for
multi-LUNed devices
o 478269 - dump causes double panic because of qldump
failure
o 478954 - QL driver wasn't checking whether a device was
open before removing device nodes during bus reprobe.
o 479614, 480194 - QL driver wasn't removing all LUN
nodes of a multi-LUNed device, when target was removed.
o 480198 - Maxoptix optical disks were not recognized as
such.
o 480553 - K5 was not being recognized as a RAID and
failover infrastructure didn't support failover on K5.
o 484703 - dksc is subject to recursion potentially
causing stack overflow of HA driver interrupt thread.
o 486823 - Added a couple of SOP functions needed by
FibreChannel.
o 488125 - When doing disk writes from a non-cacheline
aligned buffer, data corruption could result if the QL
part was connected via a rev. B bridge. The fix was to
disable bridge prefetch if rev. B bridge was present
and the write buffer was misaligned. As a result of
disabling prefetch, write performance does suffer; up
to 33% in the worst case of all 4 channels on an MSCSI
running simultaneously.
o 489770 - The failover infrastructure was not removing
paths before the HWG path for a LUN device was removed.
The result was that duplicate paths would be added,
potentially causing an exhaustion of path slots for a
particular group.
o 456107 - Probe all luns on Octane.
1.5 QERR conditioning script
In order for CTQ (command tag queueing) to be permitted, the
value of the SCSI mode parameter, QERR, must be set
appropriately. A script has been provided which will check
the values of the parameter and set it as appropriate. This
script is in /etc/init.d/qerr_patch.
Depending on whether a system is in a FAILSAFE environment,
the QERR values of drives and RAID devices may be set
differently. Consequently, when running the qerr_patch
script, an argument must be specified which determines the
FAILSAFE configuration. The allowed values of the argument
are:
o FSnone - the current system is not in a FAILSAFE
configuration.
o FShomogenous - the current system is in a homogeneous
FAILSAFE configuration, i.e. Origin-to-Origin, or
Challenge-to-Challenge.
o FSheterogenous - the current system is in a
heterogeneous FAILSAFE configuration, i.e.
Origin-to-Challenge.
If the script has not been used to configure QERR values, a
message will be printed during boot time as a reminder to
run the script.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ ./qerr_patch:
+ WARNING: Setting of QErr bit of some drives/RAID Luns may be incorrect
+ and command queueing may therefore be inhibited. Perform the
+ following corrective action:
+ /etc/init.d/qerr_patch FSnone (if not in Failsafe
+ environment)
+ /etc/init.d/qerr_patch FShomogenous (if in homogenous Failsafe
+ environment)
+ /etc/init.d/qerr_patch FSheterogenous (if in heterogenous Failsafe
+ environment)
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Currently, qerr_patch will attempt to reconfigure only
SGI-qualified devices: IBM and QUANTUM disk drives, and
Clariion Sauna/Phoenix RAID.
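
Because qerr_patch requires exactly one of three arguments, it may help to validate the argument before running the script. The wrapper below is a sketch under that assumption: the function name is hypothetical, it only prints the command line it would run, and it relies on nothing about qerr_patch beyond the path and argument names given in this section.

```shell
#!/bin/sh
# Sketch: build the qerr_patch command line for a FAILSAFE setting.
# The argument spellings (including "homogenous") are taken verbatim
# from the boot-time message; the function itself is illustrative.

qerr_patch_command() {
    case "$1" in
        FSnone|FShomogenous|FSheterogenous)
            echo "/etc/init.d/qerr_patch $1"
            ;;
        *)
            echo "argument must be FSnone, FShomogenous or FSheterogenous" >&2
            return 1
            ;;
    esac
}

# Example: a system that is not in a FAILSAFE configuration.
qerr_patch_command FSnone
```

Printing the command instead of executing it keeps the sketch safe to try on a machine where the patch is not installed.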
1.6 Subsystems Included in Patch SG0002837
This patch release includes these subsystems:
o patchSG0002837.eoe.sw.unix
1.7 Installation Instructions
Because you want to install only the patches for problems
you have encountered, patch software is not installed by
default. After reading the descriptions of the bugs fixed
in this patch (see Section 1.4), determine the patches that
meet your specific needs.
If, after reading Sections 1.1 and 1.2 of these release
notes, you are unsure whether your hardware and software
meet the requirements for installing a particular patch, run
inst. The inst program does not allow you to install
patches that are incompatible with your hardware or
software.
Patch software is installed like any other Silicon Graphics
software product. Follow the instructions in your Software
Installation Administrator's Guide to bring up the miniroot
form of the software installation tools.
Follow these steps to select a patch for installation:
1. At the Inst> prompt, type
install patchSGxxxxxxx
where xxxxxxx is the patch number.
2. Initiate the installation sequence. Type
Inst> go
3. You may find that two patches have been marked as
incompatible. (The installation tools reject an
installation request if an incompatibility is
detected.) If this occurs, you must deselect one of
the patches.
Inst> keep patchSGxxxxxxx
where xxxxxxx is the patch number.
4. After completing the installation process, exit the
inst program by typing
Inst> quit
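
For a concrete patch number, the commands from steps 1, 2 and 4 can be listed ahead of time. The helper below is a hypothetical dry-run sketch: it only prints the text to type at the Inst> prompt and never starts inst. The seven-digit check is an assumption based on the patch numbers that appear in this note.

```shell
#!/bin/sh
# Sketch: print the Inst> commands for installing one patch.
# Nothing is executed; the output is a checklist to type by hand.

inst_commands() {
    patch_number=$1
    # Assumption: patch numbers are seven digits, e.g. 0002837.
    case "$patch_number" in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *)
            echo "expected a 7-digit patch number" >&2
            return 1
            ;;
    esac
    echo "install patchSG$patch_number"
    echo "go"
    echo "quit"
}

inst_commands 0002837
```

Step 3 is left out on purpose, since keep is only needed when inst reports an incompatibility.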
1.8 Patch Removal Instructions
To remove a patch, use the versions remove command as you
would for any other software subsystem. The removal process
reinstates the original version of software unless you have
specifically removed the patch history from your system.
versions remove patchSGxxxxxxx
where xxxxxxx is the patch number.
To keep a patch but increase your disk space, use the
versions removehist command to remove the patch history.
versions removehist patchSGxxxxxxx
where xxxxxxx is the patch number.
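
The two forms differ only in the subcommand, so the choice can be made explicit in a small helper. This is a sketch with a hypothetical function name; it prints the versions command line and does not run it.

```shell
#!/bin/sh
# Sketch: print the versions command for removing a patch ("remove")
# or for discarding only its saved history ("removehist").

removal_command() {
    action=$1
    patch_number=$2
    case "$action" in
        remove|removehist) ;;
        *)
            echo "action must be remove or removehist" >&2
            return 1
            ;;
    esac
    echo "versions $action patchSG$patch_number"
}

removal_command remove 0002837
removal_command removehist 0002837
```

As the text above notes, once the history has been removed the original software can no longer be reinstated by removing the patch, so the second form is the one to double-check before running.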
1.9 Known Problems